






83] Notes 765 

NOTES 



A USE FOR TRIGONOMETRIC TABLES IN CORRELATION 

By Holbrook Working, University of Minnesota



In any problem in partial, or multiple, correlation, it is necessary to
find the value of $\sqrt{1-r^2}$ or $\log\sqrt{1-r^2}$ for several different values of r.
The same quantity also appears in the formula for the standard error
of an equation of regression, $s_y = \sigma_y\sqrt{1-r^2}$.*

Where r varies between the limits +1 and -1, as is always the
case in these formulae, there is an interesting and useful trigonometric
relation between r and $\sqrt{1-r^2}$. Anyone familiar with analytical
geometry will recognize the equation $y = \sqrt{1-r^2}$ as the equation of a
circle. There is also a simple geometrical proof of the relation. Draw
the arc YAX with radius OA = 1. Construct $AX_1$ perpendicular to
$OX_1$, giving the right triangle $OAX_1$. By the familiar relation that the
square of the hypotenuse equals the sum of the squares of the other
two sides, we have $(AX_1)^2 = (OA)^2 - (OX_1)^2$. But OA being taken equal
to unity and $OX_1 = r$, we have $AX_1 = \sqrt{1-r^2}$.

Now it is immediately apparent that $AX_1$ is the sine of angle a
and that $OX_1$ is the cosine of angle a. It follows that if we find the
angle whose cosine is r, we need but take the sine of that angle to
get $\sqrt{1-r^2}$, or the Log of the sine to get $\log\sqrt{1-r^2}$. The reverse is
also true. Briefly:

If $r = \cos a$, then $\sin a = \sqrt{1-r^2}$ and $\log\sin a = \log\sqrt{1-r^2}$.

If $r = \sin a$, then $\cos a = \sqrt{1-r^2}$ and $\log\cos a = \log\sqrt{1-r^2}$.
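
The two relations can be checked numerically. The following is a minimal sketch in Python, not part of the original note; the function names `root_from_cosine` and `root_from_sine` are hypothetical and chosen only for illustration.

```python
import math

def root_from_cosine(r):
    """Take r as cos a and return sin a, which equals sqrt(1 - r^2)."""
    a = math.acos(r)          # angle whose cosine is r, defined for -1 <= r <= 1
    return math.sin(a)

def root_from_sine(r):
    """Take r as sin a and return cos a, which equals sqrt(1 - r^2)."""
    a = math.asin(r)          # angle whose sine is r, defined for -1 <= r <= 1
    return math.cos(a)

for r in (0.20, 0.50, 0.99):
    direct = math.sqrt(1.0 - r * r)
    print(f"r = {r:.2f}  direct = {direct:.6f}  "
          f"via cosine = {root_from_cosine(r):.6f}  via sine = {root_from_sine(r):.6f}")
```

Both routes agree with the direct square root to within floating-point rounding, which is all the trigonometric table lookup relies on.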

Knowing these relations, the value of $\sqrt{1-r^2}$ or of $\log\sqrt{1-r^2}$ may
quickly be found in any table of the trigonometric functions. The
accuracy with which these values may be found is very fair. Without
interpolation a table giving the trigonometric function for each minute
of arc gives an accuracy sufficient for the purpose of ordinary statistical
investigation. For example, take r = 0.20. The angle (of even
minutes) whose sine is nearest 0.20 is the angle 11° 32'. Since fractions
of minutes are neglected, the error in the angle taken may be 1/2 minute,
but not more than 1/2 minute. The cosine of angle 11° 32' is 0.97981.
A difference of 1/2 minute in the angle will change the cosine by 0.00003.

* The notation is that of Yule. Cf. Yule, Introduction to the Theory of Statistics, 5th Edition, p. 177. 



766 



American Statistical Association 



[84 




This is the maximum possible error in the determination of $\sqrt{1-r^2}$
when r = 0.20. For higher values of r the error is greater. If r = 0.99
the maximum possible error in the determination of $\sqrt{1-r^2}$ is 0.00071.
For r = 0.99, again, the maximum possible error in the determination
of $\log\sqrt{1-r^2}$ is 0.00044.
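
The lookup without interpolation can be imitated in a few lines. This is a rough sketch under stated assumptions, not the author's procedure: the "table" is simulated by rounding the angle to the nearest whole minute, the error bound is a first-order estimate for a half-minute change in the angle, and the names `lookup_without_interpolation` and `half_minute_error` are hypothetical.

```python
import math

ARC_MINUTE = math.radians(1.0 / 60.0)   # one minute of arc, in radians

def lookup_without_interpolation(r):
    """Simulate a one-minute table: round arcsin r to the nearest whole
    minute, then read off the cosine and its common logarithm."""
    minutes = round(math.asin(r) / ARC_MINUTE)
    a = minutes * ARC_MINUTE
    return minutes, math.cos(a), math.log10(math.cos(a))

def half_minute_error(r):
    """First-order change in cos a and in log10(cos a) when the angle
    a = arcsin r is altered by half a minute of arc."""
    a = math.asin(r)
    half_minute = ARC_MINUTE / 2.0
    d_cos = math.sin(a) * half_minute
    d_log = math.tan(a) * half_minute / math.log(10.0)
    return d_cos, d_log

minutes, cos_a, _ = lookup_without_interpolation(0.20)
degrees, odd_minutes = divmod(minutes, 60)
print(f"r = 0.20: angle {degrees} deg {odd_minutes} min, cosine {cos_a:.5f}")
print("half-minute error at r = 0.20:", half_minute_error(0.20))
print("half-minute error at r = 0.99:", half_minute_error(0.99))
```

For r = 0.20 the sketch selects 11° 32' with cosine 0.97981 and a half-minute change in the cosine of about 0.00003; for r = 0.99 the first-order change in the logarithm is about 0.00044.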

Therefore we may conclude that when a table giving the trigonometric
functions for each minute of arc is used without interpolation,
the value obtained for $\sqrt{1-r^2}$ or for $\log\sqrt{1-r^2}$ is accurate to at least
three decimals. By interpolation, accuracy to at least four decimals
may be obtained.

The accompanying table was prepared by the method outlined
above, but using tables giving the functions to eight decimals for
each sexagesimal second of arc.* In the table, values of $\log\sqrt{1-r^2}$

* The tables used were E. Gifford, Natural Sines, Manchester, 1914, and Bauschinger and Peters, 
Logarithmic-Trigonometrical Tables, Leipzig, 1911. 






for the higher values of r may be in error as much as 0.000004, but
not more. The accuracy of the values given for $\sqrt{1-r^2}$ is slightly
better.
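
Rows of such a table can be recomputed directly. The sketch below is only an illustration and does not reproduce the printed tables the author worked from; the function name `table_row` is hypothetical. It assumes the usual tabular convention of adding 10 to a negative common logarithm, which the printed entries appear to follow (for example 9.937 531 at r = .50).

```python
import math

def table_row(r):
    """Return (r, sqrt(1 - r^2), tabular log of sqrt(1 - r^2)) using r = sin a."""
    a = math.asin(r)                  # angle whose sine is r
    root = math.cos(a)                # cos a = sqrt(1 - r^2)
    log_root = math.log10(root)
    if log_root < 0:
        log_root += 10.0              # assumed convention: negative logs are written with 10 added
    return r, root, log_root

for hundredths in range(0, 100, 10):
    r, root, log_root = table_row(hundredths / 100)
    print(f"{r:.2f}   {root:.6f}   {log_root:.6f}")
```

Changing the step in `range` from 10 to 1 prints all one hundred rows for comparison with the table.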



VALUES OF $\sqrt{1-r^2}$ AND $\log\sqrt{1-r^2}$ TO SIX DECIMAL PLACES FOR VALUES OF r
FROM .00 TO .99 FOR EACH CHANGE OF .01 IN r.*

r         $\sqrt{1-r^2}$    $\log\sqrt{1-r^2}$    r         $\sqrt{1-r^2}$    $\log\sqrt{1-r^2}$
(sin a)   (cos a)           (Log cos a)           (sin a)   (cos a)           (Log cos a)

.00       1.000 000         0.000 000             .50        .866 025         9.937 531
.01        .999 950         9.999 978             .51        .860 174         9.934 586
.02        .999 800               913             .52        .854 167             1 543
.03        .999 550               804             .53        .847 997         9.928 394
.04        .999 200               652             .54        .841 665             5 139
.05        .998 749               456             .55        .835 165             1 772
.06        .998 198               217             .56        .828 492         9.918 288
.07        .997 547         9.998 933             .57        .821 644             4 684
.08        .996 795               606             .58        .814 616               953
.09        .995 942               234             .59        .807 404         9.907 091
.10        .994 983         9.997 818             .60        .799 999             3 089
.11        .993 932               357             .61        .792 402         9.898 946
.12        .992 774             6 850             .62        .784 602             4 649
.13        .991 551               299             .63        .776 596               195
.14        .990 152             5 702             .64        .768 374         9.885 572
.15        .988 686               058             .65        .759 933               776
.16        .987 117             4 369             .66        .751 264         9.875 793
.17        .985 444             3 632             .67        .742 363               616
.18        .983 667             2 848             .68        .733 212         9.865 230
.19        .981 784             2 016             .69        .723 807         9.859 623
.20        .979 796             1 136             .70        .714 144         9.853 786
.21        .977 702               206             .71        .704 201         9.847 696
.22        .975 499         9.989 227             .72        .693 975         9.841 344
.23        .973 190             8 198             .73        .683 447         9.834 705
.24        .970 772             7 117             .74        .672 507         9.827 762
.25        .968 246             6 986             .75        .661 439         9.820 490
.26        .965 609             4 801             .76        .649 923         9.812 862
.27        .962 861             3 564             .77        .638 044         9.804 851
.28        .960 000             2 271             .78        .625 780         9.796 422
.29        .957 026               924             .79        .613 106         9.787 536
.30        .953 940         9.979 521             .80        .600 001         9.778 152
.31        .950 737             8 060             .81        .586 431         9.768 217
.32        .947 417             6 541             .82        .572 365         9.757 673
.33        .943 980             4 963             .83        .557 765         9.746 451
.34        .940 425             3 324             .84        .542 584         9.734 467
.35        .936 750             1 624             .85        .526 783         9.721 632
.36        .932 952         9.969 859             .86        .510 293         9.707 819
.37        .929 032             8 031             .87        .493 052         9.693 893
.38        .924 987             6 136             .88        .474 972         9.676 668
.39        .920 815             4 172             .89        .455 959         9.659 926
.40        .916 516             2 140             .90        .435 890         9.639 377
.41        .912 086               036             .91        .414 609         9.617 639
.42        .907 523         9.957 858             .92        .391 918         9.593 195
.43        .902 829             5 606             .93        .367 561         9.565 329
.44        .897 998             3 275             .94        .341 173         9.533 974
.45        .893 029               866             .95        .312 252         9.494 505
.46        .887 918         9.948 373             .96        .280 001         9.447 160
.47        .882 667             5 797             .97        .243 107         9.386 798
.48        .877 270             3 133             .98        .198 993         9.299 847
.49        .871 723               378             .99        .141 068         9.149 429

* Professor Bruce D. Mudgett has called my attention to a similar table, carried to a smaller number 
of decimals, appearing in a Bulletin of the University of Texas: Tables to Facilitate the Calculation of 
Partial Coefficients of Correlation and Regression Equations, by Truman Lee Kelley. 

A study of the accompanying table results in some interesting conclusions
regarding the accuracy with which r ought to be obtained in
calculating the original correlation. The following relations are found:

For values of r up to                                        0.10    0.45    0.71    0.89    0.95    0.97
a change of 0.01 in r changes $\sqrt{1-r^2}$ less than:      0.001   0.005   0.01    0.02    0.03    0.04

From the above it appears that when r does not exceed 0.71, $\sqrt{1-r^2}$
may be found to at least the same degree of accuracy as r.* For values
of r over 0.71, $\sqrt{1-r^2}$ is not accurate to as many decimals as r, but
is at least accurate to one decimal less than r.

It follows that in order to have $\sqrt{1-r^2}$ accurate to two decimals,
values of r above 0.71 should be calculated to three decimals.

In order to have $\sqrt{1-r^2}$ accurate to three decimals, the value of r
must be calculated as follows:

r not over 0.10       2 decimals suffice
r not over 0.71       3 decimals suffice
r over 0.71           4 decimals suffice
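
The limits quoted for the effect of a change of 0.01 in r can be verified by direct computation. The following is a small sketch, not part of the original note; it assumes the change is measured over the last step of 0.01 below each limiting value of r, and the names `root` and `change_for_step` are hypothetical.

```python
import math

def root(r):
    """sqrt(1 - r^2)."""
    return math.sqrt(1.0 - r * r)

def change_for_step(r, step=0.01):
    """Absolute change in sqrt(1 - r^2) as r rises from r - step to r."""
    return abs(root(r) - root(r - step))

limits = (0.10, 0.45, 0.71, 0.89, 0.95, 0.97)
bounds = (0.001, 0.005, 0.01, 0.02, 0.03, 0.04)

for limit, bound in zip(limits, bounds):
    change = change_for_step(limit)
    print(f"r up to {limit:.2f}: change {change:.5f}, stated bound {bound}")
```

Because the slope of $\sqrt{1-r^2}$ grows in magnitude as r increases, the last step below each limit is the largest one in its range, so checking that step suffices.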

Taken in itself, the circumstance that in order to get $\sqrt{1-r^2}$ accurate
to three decimals, r must be calculated to at least four decimals if
it be larger than 0.71, does not prove that r should be calculated to
four decimals. It has generally been considered that r should not be
calculated beyond two or, at the most, three decimals, further accuracy
being fictitious. If this assumption be well founded, it follows
that the value of $\sqrt{1-r^2}$ may not be accurately found beyond one or,
at the most, two decimals, if r be over 0.71. Which is the correct
conclusion: that where r is over 0.71, its value should be calculated to
a larger number of places than for lower values; or that where r is
over 0.71, $\sqrt{1-r^2}$ cannot be found accurately to as many places as for
lower values of r?

The problem may be approached by a consideration of the properties
of the standard error ($\sigma_y\sqrt{1-r^2}$), which measures the dispersion
of the observations about the line of regression. Given the equation
of regression, the standard error may be calculated quite independently
of the coefficient of correlation by finding the deviation of each
observation from the line of regression and calculating the root-mean-
square of these deviations. The writer sees no reason for considering
that the standard error can be determined with any greater accuracy
where the correlation is low than where the correlation is high. But
if the standard error, which is $\sigma_y\sqrt{1-r^2}$, may be determined with as
great accuracy for high values of r as for low values, it follows, since
$\sigma_y$ is not affected by the closeness of the correlation, that $\sqrt{1-r^2}$ may
* The accuracy referred to is, of course, not a question of the method used, but is inherent in the
relation of r to $\sqrt{1-r^2}$.




be determined with as great accuracy for high values of r as for low 
values. If this be true, it follows further that for high values of r 
its value should be calculated to a larger number of decimals than for 
low values. 
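
The two routes to the standard error described above can be compared on a small data set. This is a minimal sketch with made-up observations, not data from the note; the function name `standard_error_two_ways` is hypothetical, and standard deviations are taken with divisor n so that the two results agree exactly.

```python
import math

def standard_error_two_ways(xs, ys):
    """Return the standard error of the regression of y on x computed
    (1) as the root-mean-square deviation from the fitted line, and
    (2) as sigma_y * sqrt(1 - r^2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # Least-squares line of regression of y on x.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # (1) Root-mean-square of the deviations from the line.
    squared = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    rms = math.sqrt(sum(squared) / n)

    # (2) The formula using the coefficient of correlation.
    r = sxy / math.sqrt(sxx * syy)
    sigma_y = math.sqrt(syy / n)
    return rms, sigma_y * math.sqrt(1.0 - r * r)

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # illustrative values only
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 5.8]
rms, from_formula = standard_error_two_ways(xs, ys)
print(rms, from_formula)
```

The first route never uses r, which is the point of the argument: the standard error is fixed by the deviations themselves, however close the correlation.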

Since, as noted above, a table giving the sines and cosines for each
minute of arc gives $\sqrt{1-r^2}$ accurate to at least three decimal places
without interpolation, such a table used without interpolation is adequate
where r is calculated to the number of decimals indicated in the
preceding table, that is, where r is calculated so as to give $\sqrt{1-r^2}$
accurate to three decimals.* Seldom, if ever, is it desirable to calculate
the value of r to a larger number of decimals; the apparent greater
accuracy would be fictitious.



THE GRAPHIC REPRESENTATION OF A FREQUENCY DISTRIBUTION

By Robert E. Chaddock, Columbia University 



In the accompanying diagrams and the explanatory text the purpose 
is two-fold: (1) to present a method of exposition by which the student 
of elementary statistical methods will be enabled to comprehend more 
clearly the nature of the frequency distribution and the assumptions 
underlying grouped data; (2) to correct or clarify the rules for locating 
the median and quartiles which appear in some of the widely used elementary
text-books. The student is told to add one to the number of
cases and to divide by two, $\frac{n+1}{2}$, in order to locate the mid-item in
the array, the value of which is the median average. But if this rule is
followed, slightly different values are secured for the median, depending 
on whether the calculation is made from the lower toward the higher 
values in the array, or from the higher toward the lower, whereas either 
procedure should yield exactly the same result. The reason is clear. 
One extra item has been inserted. After all, what we seek is not a case 
but a value in an array of items arranged in a regularly ascending or 
descending order of values, which are assumed to be evenly distributed 
over the scale within a given class interval so that progress in one direction
shows constantly increasing values, and in the other direction
constantly decreasing values.
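
The discrepancy can be seen on a small grouped distribution. The sketch below uses a hypothetical distribution and hypothetical function names; it interpolates within the class on the assumption of evenly spread items, and it assumes that the consistent alternative to $\frac{n+1}{2}$ is to count $\frac{n}{2}$ items from either end.

```python
def value_counting_up(lower, width, freqs, position):
    """Value reached after counting `position` items upward from the lowest
    class, with items assumed evenly spread within each class interval."""
    cumulative = 0
    for i, f in enumerate(freqs):
        if f > 0 and cumulative + f >= position:
            return lower + width * i + width * (position - cumulative) / f
        cumulative += f
    raise ValueError("position exceeds the number of cases")

def value_counting_down(lower, width, freqs, position):
    """Value reached after counting `position` items downward from the
    highest class, under the same assumption."""
    upper = lower + width * len(freqs)
    cumulative = 0
    for i, f in enumerate(reversed(freqs)):
        if f > 0 and cumulative + f >= position:
            return upper - width * i - width * (position - cumulative) / f
        cumulative += f
    raise ValueError("position exceeds the number of cases")

# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with these frequencies.
freqs = [3, 7, 8, 4]
n = sum(freqs)

for label, position in (("(n+1)/2", (n + 1) / 2), ("n/2", n / 2)):
    up = value_counting_up(0.0, 10.0, freqs, position)
    down = value_counting_down(0.0, 10.0, freqs, position)
    print(f"{label}: from below {up}, from above {down}")
```

With $\frac{n+1}{2}$ the two directions give different medians for this distribution, whereas with $\frac{n}{2}$ they coincide.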

* For this purpose the most convenient tables the writer has seen are in G. W. Jones: Logarithmic
Tables. London: Macmillan, 1898. These tables give the natural and logarithmic functions in parallel
columns for each minute of arc.



